Results 1 - 9 of 9
1.
medrxiv; 2023.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2023.10.12.23296948

ABSTRACT

Background: The Global Burden of Disease study has provided key evidence to inform clinicians, researchers, and policy makers across common diseases, but no similar effort with a single study design exists for hundreds of rare diseases. Consequently, many rare conditions lack population-level evidence, including prevalence and clinical vulnerability. This has led to the absence of evidence-based care for rare diseases, prominently in the COVID-19 pandemic. Method: This study used electronic health records (EHRs) of more than 58 million people in England, linking nine National Health Service datasets spanning healthcare settings for people alive on Jan 23, 2020. Starting with all rare diseases listed in Orphanet, we quality assured and filtered down to analyse 331 conditions with ICD-10 or SNOMED-CT mappings clinically validated in our dataset. We 1) report population prevalence, clinical and demographic details of rare diseases, and 2) investigate differences in mortality with SARS-CoV-2. Findings: Among 58,162,316 individuals, we identified 894,396 with at least one rare disease. Prevalence data in Orphanet originates from various sources with varying degrees of precision. Here we present reproducible age and gender-adjusted estimates for all 331 rare diseases, including first estimates for 186 (56.2%) without any reported prevalence estimate in Orphanet. We identified 49 rare diseases significantly more frequent in females and 62 in males. Similarly, we identified 47 rare diseases more frequent in Asian as compared to White ethnicity and 22 with higher Black-to-White ratios as compared to similar ratios in population controls. 37 rare diseases were overrepresented in the White population as compared to both Black and Asian ethnicities. In total, 7,965 of 894,396 (0.9%) rare-disease patients died from COVID-19, as compared to 141,287 of 58,162,316 (0.2%) in the full study population. Eight rare diseases had significantly increased risks for COVID-19-related mortality in fully vaccinated individuals, with bullous pemphigoid (8.07 [3.01-21.62]) being worst affected. Interpretation: Our study highlights that national-scale EHRs provide a unique resource to estimate detailed prevalence, clinical and demographic data for rare diseases. Using COVID-19-related mortality analysis, we showed the power of large-scale EHRs in providing insights to inform public health decision-making for these often neglected patient populations.
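The two mortality percentages in the Findings can be reproduced from the counts quoted in the abstract. The sketch below does only that arithmetic. The crude ratio it prints is an illustrative, unadjusted quantity that is not reported in the abstract, and the comparison group (the full study population) includes the rare-disease patients themselves, so it should not be read as the study's adjusted risk estimate.

```python
# Counts quoted in the abstract above.
rare_deaths, rare_total = 7_965, 894_396
all_deaths, all_total = 141_287, 58_162_316


def proportion(events: int, population: int) -> float:
    """Crude proportion of a population that experienced the event."""
    return events / population


rare_risk = proportion(rare_deaths, rare_total)
overall_risk = proportion(all_deaths, all_total)

# Crude (unadjusted) ratio of the two proportions.
# Assumption: shown for illustration only; it is not a figure from the study.
crude_ratio = rare_risk / overall_risk

print(f"rare-disease patients: {rare_risk:.1%}")
print(f"full study population: {overall_risk:.1%}")
print(f"crude ratio: {crude_ratio:.2f}")
```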


Subject(s)
COVID-19 , Pemphigoid, Bullous , Rare Diseases , Disease
2.
arxiv; 2022.
Preprint in English | PREPRINT-ARXIV | ID: ppzbmed-2204.09781v3

ABSTRACT

The COVID-19 pandemic has been severely impacting global society since December 2019. Extensive research has been undertaken to understand the characteristics of the virus and design vaccines and drugs. The related findings have been reported in biomedical literature at a rate of about 10,000 articles on COVID-19 per month. Such rapid growth significantly challenges manual curation and interpretation. For instance, LitCovid is a literature database of COVID-19-related articles in PubMed, which has accumulated more than 200,000 articles with millions of accesses each month by users worldwide. One primary curation task is to assign up to eight topics (e.g., Diagnosis and Treatment) to the articles in LitCovid. Despite the continuing advances in biomedical text mining methods, few have been dedicated to topic annotations in COVID-19 literature. To close the gap, we organized the BioCreative LitCovid track to call for a community effort to tackle automated topic annotation for COVID-19 literature. The BioCreative LitCovid dataset, consisting of over 30,000 articles with manually reviewed topics, was created for training and testing. It is one of the largest multilabel classification datasets in biomedical scientific literature. Nineteen teams worldwide participated and made 80 submissions in total. Most teams used hybrid systems based on transformers. The highest performing submissions achieved 0.8875, 0.9181, and 0.9394 for macro F1-score, micro F1-score, and instance-based F1-score, respectively. The level of participation and results demonstrate a successful track and help close the gap between dataset curation and method development. The dataset is publicly available via https://ftp.ncbi.nlm.nih.gov/pub/lu/LitCovid/biocreative/ for benchmarking and further development.
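The three scores named in the abstract differ only in how per-label and per-article counts are averaged. The following is a minimal sketch of the standard definitions for multilabel data, not the track's official evaluation script. The toy articles and the function name `multilabel_f1` are hypothetical, and the convention of scoring an empty denominator as 0.0 is an assumption.

```python
from typing import Dict, List, Sequence, Set, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; returns 0.0 when undefined (assumed convention)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def multilabel_f1(
    gold: Sequence[Set[str]], pred: Sequence[Set[str]], labels: Sequence[str]
) -> Tuple[float, float, float]:
    """Return (macro, micro, instance-based) F1 for multilabel predictions."""
    counts: Dict[str, List[int]] = {lab: [0, 0, 0] for lab in labels}  # tp, fp, fn
    for g, p in zip(gold, pred):
        for lab in labels:
            if lab in g and lab in p:
                counts[lab][0] += 1
            elif lab in p:
                counts[lab][1] += 1
            elif lab in g:
                counts[lab][2] += 1

    # Macro: average the per-label F1 scores, so rare labels weigh as much as common ones.
    macro = sum(f1(*c) for c in counts.values()) / len(labels)

    # Micro: pool the counts over all labels, then compute a single F1.
    tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))
    micro = f1(tp, fp, fn)

    # Instance-based: compute F1 per article, then average over articles.
    instance = sum(
        f1(len(g & p), len(p - g), len(g - p)) for g, p in zip(gold, pred)
    ) / len(gold)
    return macro, micro, instance


# Hypothetical toy annotations for three articles.
labels = ["Diagnosis", "Prevention", "Treatment"]
gold = [{"Treatment", "Diagnosis"}, {"Prevention"}, {"Treatment"}]
pred = [{"Treatment"}, {"Prevention", "Diagnosis"}, {"Treatment"}]

macro, micro, instance = multilabel_f1(gold, pred, labels)
print(f"macro={macro:.4f} micro={micro:.4f} instance={instance:.4f}")
```

On this toy input the one badly predicted label pulls the macro score down furthest, which is why the three numbers in the abstract are reported separately.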


Subject(s)
COVID-19
3.
medrxiv; 2021.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2021.11.08.21265312

ABSTRACT

Background: Updatable understanding of the onset and progression of individuals’ COVID-19 trajectories underpins pandemic mitigation efforts. In order to identify and characterize individual trajectories, we defined and validated ten COVID-19 phenotypes from linked electronic health records (EHR) on a nationwide scale using an extensible framework. Methods: Cohort study of 56.6 million people in England alive on 23/01/2020, followed until 31/05/2021, using eight linked national datasets spanning COVID-19 testing, vaccination, primary and secondary care and death registrations data. We defined ten COVID-19 phenotypes reflecting clinically relevant stages of disease severity using a combination of international clinical terminologies (e.g. SNOMED-CT, ICD-10) and bespoke data fields: positive test, primary care diagnosis, hospitalisation, critical care (four phenotypes), and death (three phenotypes). Using these phenotypes, we constructed patient trajectories illustrating the transition frequency and duration between phenotypes. Analyses were stratified by pandemic waves and vaccination status. Findings: We identified 3,469,528 infected individuals (6.1%) with 8,825,738 recorded COVID-19 phenotypes. Of these, 364,260 (11%) were hospitalised and 140,908 (4%) died. Of those hospitalised, 38,072 (10%) were admitted to intensive care (ICU), 54,026 (15%) received non-invasive ventilation and 21,404 (6%) invasive ventilation. Amongst hospitalised patients, first wave mortality (30%) was higher than the second (23%) in non-ICU settings, but remained unchanged for ICU patients. The highest mortality was for patients receiving critical care outside of ICU in wave 1 (51%). 13,083 (9%) COVID-19 related deaths occurred without diagnoses on the death certificate, but within 30 days of a positive test, while 10,403 (7%) of cases were identified from mortality data alone with no prior phenotypes recorded. We observed longer patient trajectories in the second pandemic wave compared to the first. Interpretation: Our analyses illustrate the wide spectrum of severity that COVID-19 displays and significant differences in incidence, survival and pathways across pandemic waves. We provide an adaptable framework to answer questions of clinical and policy relevance, such as new variant impact and booster dose efficacy, and a way of maximising existing data to understand individuals’ progression through disease states.
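The Methods describe trajectories built from the transition frequency and duration between phenotypes. One way such a summary could be computed is sketched below. The data layout (a list of `(day, phenotype)` events per patient), the patient identifiers, and the function name `summarise_transitions` are all hypothetical and are not taken from the study's framework.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Event = Tuple[int, str]  # (day since first record, phenotype name)


def summarise_transitions(
    trajectories: Dict[str, List[Event]]
) -> Dict[Tuple[str, str], Tuple[int, float]]:
    """Map each (from, to) phenotype pair to (number of transitions, mean days)."""
    durations: Dict[Tuple[str, str], List[int]] = defaultdict(list)
    for events in trajectories.values():
        ordered = sorted(events)  # chronological order; ties broken by name
        # Pair each event with the next one in the same patient's timeline.
        for (day_from, src), (day_to, dst) in zip(ordered, ordered[1:]):
            durations[(src, dst)].append(day_to - day_from)
    return {pair: (len(d), sum(d) / len(d)) for pair, d in durations.items()}


# Hypothetical toy cohort; events need not arrive in date order.
cohort = {
    "p1": [(0, "positive test"), (5, "hospitalisation"), (9, "critical care")],
    "p2": [(0, "positive test"), (7, "hospitalisation")],
    "p3": [(2, "hospitalisation"), (0, "positive test"), (20, "death")],
}

summary = summarise_transitions(cohort)
for (src, dst), (n, mean_days) in sorted(summary.items()):
    print(f"{src} -> {dst}: n={n}, mean days={mean_days:.1f}")
```

Stratifying by pandemic wave or vaccination status, as the abstract describes, would amount to running the same summary on each subset of patients.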


Subject(s)
COVID-19 , Death
4.
medrxiv; 2021.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2021.03.16.21253371

ABSTRACT

COVID-19 is a disease with vast impact, yet much remains unclear about patient outcomes. Most approaches to risk prediction of COVID-19 focus on binary or tertiary severity outcomes, despite the heterogeneity of the disease. In this work, we identify heterogeneous subtypes of COVID-19 outcomes by considering ‘axes’ of prognosis. We propose two innovative clustering approaches, ‘Layered Axes’ and ‘Prognosis Space’, to apply to patients’ outcome data. We then show how these clusters can help predict a patient’s deterioration pathway at hospital admission, using random forest classification. We illustrate this methodology on a cohort from Wuhan in early 2020. We discover interesting subgroups of poor prognosis, particularly within respiratory patients, and predict respiratory subgroup membership with high accuracy. This work could assist clinicians in identifying appropriate treatments at patients’ hospital admission. Moreover, our method could be used to explore subtypes of ‘long COVID’ and other diseases with heterogeneous outcomes.


Subject(s)
COVID-19 , Space Motion Sickness , Long QT Syndrome
5.
arxiv; 2020.
Preprint in English | PREPRINT-ARXIV | ID: ppzbmed-2010.15728v4

ABSTRACT

Diagnostic or procedural coding of clinical notes aims to derive a coded summary of disease-related information about patients. Such coding is usually done manually in hospitals but could potentially be automated to improve the efficiency and accuracy of medical coding. Recent studies on deep learning for automated medical coding achieved promising performance. However, the explainability of these models is usually poor, preventing them from being used confidently in supporting clinical practice. Another limitation is that these models mostly assume independence among labels, ignoring the complex correlation among medical codes which can potentially be exploited to improve the performance. We propose a Hierarchical Label-wise Attention Network (HLAN), which aims to interpret the model by quantifying the importance (as attention weights) of words and sentences related to each of the labels. Secondly, we propose to enhance the major deep learning models with a label embedding (LE) initialisation approach, which learns a dense, continuous vector representation and then injects the representation into the final layers and the label-wise attention layers in the models. We evaluated the methods using three settings on the MIMIC-III discharge summaries: full codes, top-50 codes, and the UK NHS COVID-19 shielding codes. Experiments were conducted to compare HLAN and LE initialisation to the state-of-the-art neural-network-based methods. HLAN achieved the best Micro-level AUC and $F_1$ on the top-50 code prediction and comparable results on the NHS COVID-19 shielding code prediction to other models. By highlighting the most salient words and sentences for each label, HLAN showed more meaningful and comprehensive model interpretation compared to its downgraded baselines and the CNN-based models. LE initialisation consistently boosted most deep learning models for automated medical coding.
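The idea of label-wise attention, where each label gets its own set of attention weights over the input, can be shown at a single level with plain Python. This is a generic sketch of dot-product attention with one query vector per label, written under the assumption that token representations are already available. It is not the HLAN architecture itself, which the abstract describes as hierarchical (words and sentences), and all vectors and names below are hypothetical.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def label_wise_attention(
    hidden: List[Vector], label_queries: List[Vector]
) -> List[Tuple[Vector, Vector]]:
    """For each label, return (attention weights over tokens, context vector)."""
    dim = len(hidden[0])
    outputs = []
    for query in label_queries:
        # One weight per token, specific to this label.
        weights = softmax([dot(query, h) for h in hidden])
        # Context vector: attention-weighted sum of the token vectors.
        context = [sum(w * h[i] for w, h in zip(weights, hidden)) for i in range(dim)]
        outputs.append((weights, context))
    return outputs


# Hypothetical 2-dimensional token representations for a 3-token note.
hidden = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
# Hypothetical query vectors, one per label (learned parameters in a real model).
label_queries = [[2.0, 0.0], [0.0, 2.0]]

results = label_wise_attention(hidden, label_queries)
for label_index, (weights, _context) in enumerate(results):
    print(label_index, [round(w, 3) for w in weights])
```

Because each label has its own weights, the highest-weighted tokens can be read off per label, which is the sense in which attention weights are used for interpretation in the abstract.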


Subject(s)
COVID-19
6.
medrxiv; 2020.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2020.06.10.20127175

ABSTRACT

Background: Cardiovascular diseases (CVD) increase mortality risk from coronavirus infection (COVID-19), but there are concerns that the pandemic has affected supply and demand of acute cardiovascular care. We estimated excess mortality in specific CVDs, both direct, through infection, and indirect, through changes in healthcare. Methods: We used population-based electronic health records from 3,862,012 individuals in England to estimate pre- and post-COVID-19 mortality risk (direct effect) for people with incident and prevalent CVD. We incorporated: (i) pre-COVID-19 risk by age, sex and comorbidities, (ii) estimated population COVID-19 prevalence, and (iii) estimated relative impact of COVID-19 on mortality (relative risk, RR: 1.5, 2.0 and 3.0). For indirect effects, we analysed weekly mortality and emergency department data for England/Wales and monthly hospital data from England (n=2), China (n=5) and Italy (n=1) for CVD referral, diagnosis and treatment until 1 May 2020. Findings: CVD service activity decreased by 60-100% compared with pre-pandemic levels in eight hospitals across China, Italy and England during the pandemic. In China, activity remained below pre-COVID-19 levels for 2-3 months even after easing lockdown, and is still reduced in Italy and England. Mortality data suggest indirect effects on CVD will be delayed rather than contemporaneous (peak RR 1.4). For total CVD (incident and prevalent), at 10% population COVID-19 rate, we estimated direct impact of 31,205 and 62,410 excess deaths in England at RR 1.5 and 2.0 respectively, and indirect effect of 49,932 to 99,865 excess deaths. Interpretation: Supply and demand for CVD services have dramatically reduced across countries with potential for substantial, but avoidable, excess mortality during and after the COVID-19 pandemic.
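The two direct-impact figures can be checked for internal consistency under one simple assumption that the abstract does not state: that excess deaths equal baseline expected deaths multiplied by the infection rate and by (RR - 1). The sketch below back-solves the baseline implied by the RR 1.5 figure and confirms that the same baseline reproduces the RR 2.0 figure. The formula, the function name `excess_deaths`, and the implied baseline are assumptions for illustration, not values reported in the abstract.

```python
def excess_deaths(baseline_deaths: float, infection_rate: float, rr: float) -> float:
    """Excess deaths under the assumed model: baseline * infection rate * (RR - 1)."""
    return baseline_deaths * infection_rate * (rr - 1.0)


# Figures quoted in the abstract.
infection_rate = 0.10
reported = {1.5: 31_205, 2.0: 62_410}

# Baseline implied by the RR 1.5 figure under the assumed formula.
# Assumption: derived here for illustration; not a number from the study.
implied_baseline = reported[1.5] / (infection_rate * (1.5 - 1.0))

for rr, expected in reported.items():
    estimate = excess_deaths(implied_baseline, infection_rate, rr)
    print(f"RR {rr}: {estimate:,.0f} (abstract reports {expected:,})")
```

That a single baseline fits both reported figures suggests the two estimates scale linearly with (RR - 1), although the study's actual model also accounts for age, sex and comorbidities.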


Subject(s)
COVID-19 , Coronavirus Infections , Cardiovascular Diseases
7.
ssrn; 2020.
Preprint in English | PREPRINT-SSRN | ID: ppzbmed-10.2139.ssrn.3590468

ABSTRACT

Background: Accurate risk prediction of clinical outcome would usefully inform clinical decisions and intervention targeting in COVID-19. The aim of this study was to derive and validate risk prediction models for poor outcome and death in adult inpatients with COVID-19. Methods: Model derivation using data from Wuhan, China used logistic regression with death and poor outcome (death or severe disease) as outcomes. Predictors were demographic, comorbidity, symptom and laboratory test variables. The best performing models were externally validated in data from London, UK. Findings: 4.3% of the derivation cohort (n=775) died and 9.7% had a poor outcome, compared to 34.1% and 42.9% of the validation cohort (n=226). In derivation, prediction models based on age, sex, neutrophil count, lymphocyte count, platelet count, C-reactive protein and creatinine had excellent discrimination (death c-index=0.91, poor outcome c-index=0.88), with good-to-excellent calibration. Using two cut-offs to define low, high and very-high risk groups, derivation patients were stratified in groups with observed death rates of 0.34%, 15.0% and 28.3% and poor outcome rates 0.63%, 8.9% and 58.5%. External validation discrimination was good (c-index death=0.74, poor outcome=0.72) as was calibration. However, observed rates of death were 16.5%, 42.9% and 58.4% and poor outcome 26.3%, 28.4% and 64.8% in predicted low, high and very-high risk groups. Interpretation: Our prediction model using demography and routinely-available laboratory tests performed very well in internal validation in the lower-risk derivation population, but less well in the much higher-risk external validation population. Further external validation is needed. Collaboration to create larger derivation datasets, and to rapidly externally validate all proposed prediction models in a range of populations is needed, before routine implementation of any risk prediction tool in clinical care. 
Funding Statement: HW and HZ are supported by Medical Research Council and Health Data Research UK Grant (MR/S004149/1), Industrial Strategy Challenge Grant (MC_PC_18029) and Wellcome Institutional Translation Partnership Award (PIII054). RD is supported by the National Institute for Health Research (NIHR) Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London. DMB is funded by a UKRI Innovation Fellowship as part of Health Data Research UK MR/S00310X/1 (https://www.hdruk.ac.uk). KD is supported by LifeArc STOPCOVID award. This work uses data provided by patients and collected by the NHS as part of their care and support. XW is supported by National Natural Science Foundation of China (grant number: 81700006). QL is supported by National Key R&D Program (2018YFC1313700), National Natural Science Foundation of China (grant number: 81870064) and the “Gaoyuan” project of Pudong Health and Family Planning Commission (PWYgy2018-06). Declaration of Interests: The authors declare no competing interests. Ethics Approval Statement: The derivation study was approved by the Research Ethics Committee of Shanghai Dongfang Hospital and Taikang Tongji Hospital. The external validation study operated under London South East Research Ethics Committee (reference 18/LO/2048) approval granted to the King’s Electronic Records Research Interface (KERRI).


Subject(s)
Mucocutaneous Lymph Node Syndrome , Cross Infection , COVID-19 , Pyruvate Carboxylase Deficiency Disease
8.
medrxiv; 2020.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2020.04.28.20082222

ABSTRACT

Background: Accurate risk prediction of clinical outcome would usefully inform clinical decisions and intervention targeting in COVID-19. The aim of this study was to derive and validate risk prediction models for poor outcome and death in adult inpatients with COVID-19. Methods: Model derivation using data from Wuhan, China used logistic regression with death and poor outcome (death or severe disease) as outcomes. Predictors were demographic, comorbidity, symptom and laboratory test variables. The best performing models were externally validated in data from London, UK. Findings: 4.3% of the derivation cohort (n=775) died and 9.7% had a poor outcome, compared to 34.1% and 42.9% of the validation cohort (n=226). In derivation, prediction models based on age, sex, neutrophil count, lymphocyte count, platelet count, C-reactive protein and creatinine had excellent discrimination (death c-index=0.91, poor outcome c-index=0.88), with good-to-excellent calibration. Using two cut-offs to define low, high and very-high risk groups, derivation patients were stratified in groups with observed death rates of 0.34%, 15.0% and 28.3% and poor outcome rates 0.63%, 8.9% and 58.5%. External validation discrimination was good (c-index death=0.74, poor outcome=0.72) as was calibration. However, observed rates of death were 16.5%, 42.9% and 58.4% and poor outcome 26.3%, 28.4% and 64.8% in predicted low, high and very-high risk groups. Interpretation: Our prediction model using demography and routinely-available laboratory tests performed very well in internal validation in the lower-risk derivation population, but less well in the much higher-risk external validation population. Further external validation is needed. Collaboration to create larger derivation datasets, and to rapidly externally validate all proposed prediction models in a range of populations is needed, before routine implementation of any risk prediction tool in clinical care.
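Discrimination in the Findings is reported as a c-index. For a binary outcome, the c-index is the proportion of (event, non-event) patient pairs in which the patient with the event was assigned the higher predicted risk, with ties counted as one half. A minimal sketch of that definition follows. The predicted risks and outcomes are hypothetical toy values, not data from the study, and the function name `c_index` is an illustrative choice.

```python
from typing import Sequence


def c_index(risks: Sequence[float], outcomes: Sequence[int]) -> float:
    """Concordance index for a binary outcome (1 = event, 0 = no event)."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for event_risk in events:
        for other_risk in non_events:
            if event_risk > other_risk:
                concordant += 1.0  # model ranks the pair correctly
            elif event_risk == other_risk:
                concordant += 0.5  # tie counts as half
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted risks and observed outcomes for five patients.
risks = [0.9, 0.7, 0.4, 0.4, 0.1]
outcomes = [1, 0, 1, 0, 0]

print(c_index(risks, outcomes))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The c-index says nothing about whether the predicted probabilities match observed rates, which is why the abstract reports calibration and observed rates per risk group separately.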


Subject(s)
COVID-19 , Death
9.
medrxiv; 2020.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2020.04.24.20078006

ABSTRACT

Background: The National Early Warning Score (NEWS2) is currently recommended in the United Kingdom for risk stratification of COVID outcomes, but little is known about its ability to detect severe cases. We aimed to evaluate NEWS2 for severe COVID outcome and identify and validate a set of routinely-collected blood and physiological parameters taken at hospital admission to improve the score. Methods: Training cohorts comprised 1276 patients admitted to King’s College Hospital NHS Foundation Trust with COVID-19 disease from 1st March to 30th April 2020. External validation cohorts included 5037 patients from four UK NHS Trusts (Guy’s and St Thomas’ Hospitals, University Hospitals Southampton, University Hospitals Bristol and Weston NHS Foundation Trust, University College London Hospitals), and two hospitals in Wuhan, China (Wuhan Sixth Hospital and Taikang Tongji Hospital). The outcome was severe COVID disease (transfer to intensive care unit or death) at 14 days after hospital admission. Age, physiological measures, blood biomarkers, sex, ethnicity and comorbidities (hypertension, diabetes, cardiovascular, respiratory and kidney diseases) measured at hospital admission were considered in the models. Results: A baseline model of NEWS2 + age had poor-to-moderate discrimination for severe COVID infection at 14 days (AUC in training sample = 0.700; 95% CI: 0.680, 0.722; Brier score = 0.192; 95% CI: 0.186, 0.197). A supplemented model adding eight routinely-collected blood and physiological parameters (supplemental oxygen flow rate, urea, age, oxygen saturation, CRP, estimated GFR, neutrophil count, neutrophil/lymphocyte ratio) improved discrimination (AUC = 0.735; 95% CI: 0.715, 0.757) and these improvements were replicated across five UK and non-UK sites. However, there was evidence of miscalibration with the model tending to underestimate risks in most sites. Conclusions: NEWS2 score had poor-to-moderate discrimination for medium-term COVID outcome, which raises questions about its use as a screening tool at hospital admission. Risk stratification was improved by including readily available blood and physiological parameters measured at hospital admission, but there was evidence of miscalibration in external sites. This highlights the need for a better understanding of the use of early warning scores for COVID.
Key messages:
- The National Early Warning Score (NEWS2), currently recommended for stratification of severe COVID-19 disease in the UK, showed poor-to-moderate discrimination for medium-term outcomes (14-day transfer to ICU or death) among COVID-19 patients.
- Risk stratification was improved by the addition of blood and physiological parameters routinely measured at hospital admission (supplemental oxygen, urea, oxygen saturation, CRP, estimated GFR, neutrophil count, neutrophil/lymphocyte ratio), which provided moderate improvements in a risk stratification model for 14-day ICU/death.
- This improvement over NEWS2 alone was maintained across multiple hospital trusts, but the model tended to be miscalibrated, with risks of severe outcomes underestimated in most sites.
- We benefited from existing pipelines for informatics at KCH such as CogStack that allowed rapid extraction and processing of electronic health records. This methodological approach provided rapid insights and allowed us to overcome the complications associated with slow data centralisation approaches.
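The abstract reports a Brier score alongside the AUC and notes that the model tended to underestimate risk. The sketch below shows the standard Brier score (mean squared difference between predicted probability and observed outcome) together with a simple calibration-in-the-large check (observed event rate minus mean predicted risk). The predictions and outcomes are hypothetical toy values, and both function names are illustrative rather than taken from the study's code.

```python
from typing import Sequence


def brier_score(probabilities: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes) or not outcomes:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


def calibration_in_the_large(
    probabilities: Sequence[float], outcomes: Sequence[int]
) -> float:
    """Observed event rate minus mean predicted risk.

    Positive values mean the model underestimates risk on average;
    negative values mean it overestimates.
    """
    observed_rate = sum(outcomes) / len(outcomes)
    mean_predicted = sum(probabilities) / len(probabilities)
    return observed_rate - mean_predicted


# Hypothetical predicted risks of a severe outcome and observed outcomes.
probabilities = [0.9, 0.2, 0.6, 0.1]
outcomes = [1, 0, 0, 0]

print(f"Brier score: {brier_score(probabilities, outcomes):.3f}")
print(f"Observed minus predicted: {calibration_in_the_large(probabilities, outcomes):+.2f}")
```

Lower Brier scores are better, and a model that always predicts 0.5 scores 0.25. A model can have a reasonable AUC and still be miscalibrated, which matches the pattern the abstract describes for the external sites.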


Subject(s)
COVID-19